Goto

Collaborating Authors

 observation time


On Observation Time for Recovering Latent Hawkes Networks

arXiv.org Machine Learning

Dynamics of interacting systems in engineering, society, and nature often evolve over latent networks that govern which entities can interact. We study the problem of inferring these networks from event-based observations, which arise naturally in finance, seismology, and neuroscience. While there is substantial algorithmic work addressing this important problem, theoretical results are scarce. In this paper we ask the following fundamental question: what is the minimum time that one must observe the dynamics in order to exactly recover the underlying network, as a function of the number $d$ of interacting entities? For a class of stationary Hawkes processes with sparse, weak interactions, we prove that an observation time of order $\log d$ is sufficient and necessary. For the upper bound we construct a two-stage estimator that uses clipped and binned event data for screening, followed by a least-squares refinement, and apply concentration bounds derived from the Poisson cluster representation. For the lower bound we combine Fano's inequality with Jacod's Girsanov formula for point processes on a suitable subclass of networks.


Variational Smoothing and Inference for SDEs from Sparse Data with Dynamic Neural Flows

arXiv.org Machine Learning

Stochastic differential equations (SDEs) provide a flexible framework for modeling temporal dynamics in partially observed systems. A central task is to calibrate such models from data, which requires inferring latent trajectories and parameters from sparse, noisy observations. Classical smoothing methods for this problem are often limited by path degeneracy and poor scalability. In this work, we developed a novel method based on characterization of the posterior SDE in terms of conditional backward-in-time score defined as the gradient of a function solving a Kolmogorov backward equation with multiplicative updates at observation times. We learn this conditional score using neural networks trained to satisfy both the governing PDE and the observation-induced jump conditions, thereby integrating continuous-time dynamics with discrete Bayesian updates. The resulting score induces a posterior SDE with the same diffusion coefficient but a modified drift, enabling efficient posterior trajectory sampling. We further derive a likelihood-based objective for learning the SDE parameters, yielding an evidence lower bound (ELBO) for joint state smoothing and parameter estimation. This leads to a variational EM-style procedure, where the neural conditional score is optimized to approximate the smoothing distribution, followed by a maximization step over the SDE parameters using samples from the induced posterior. Experiments on nonlinear systems demonstrate accurate and stable inference with a very few observations demonstrating significant improved scalability compared to classical MCMC methods.




Scaling up Continuous-Time Markov Chains Helps Resolve Underspecification

Neural Information Processing Systems

Modeling the time evolution of discrete sets of items (e.g., genetic mutations) is a fundamental problem in many biomedical applications. We approach this problem through the lens of continuous-time Markov chains, and show that the resulting learning task is generally underspecified in the usual setting of cross-sectional data. We explore a perhaps surprising remedy: including a number of additional independent items can help determine time order, and hence resolve underspecifi-cation. This is in sharp contrast to the common practice of limiting the analysis to a small subset of relevant items, which is followed largely due to poor scaling of existing methods. To put our theoretical insight into practice, we develop an approximate likelihood maximization method for learning continuous-time Markov chains, which can scale to hundreds of items and is orders of magnitude faster than previous methods. We demonstrate the effectiveness of our approach on synthetic and real cancer data.


Bayesian Decentralized Decision-making for Multi-Robot Systems: Sample-efficient Estimation of Event Rates

arXiv.org Artificial Intelligence

Abstract-- Effective collective decision-making in swarm robotics often requires balancing exploration, communication and individual uncertainty estimation, especially in hazardous environments where direct measurements are limited or costly. We propose a decentralized Bayesian framework that enables a swarm of simple robots to identify the safer of two areas, each characterized by an unknown rate of hazardous events governed by a Poisson process. Robots employ a conjugate prior to gradually predict the times between events and derive confidence estimates to adapt their behavior . Our simulation results show that the robot swarm consistently chooses the correct area while reducing exposure to hazardous events by being sample-efficient. Compared to baseline heuristics, our proposed approach shows better performance in terms of safety and speed of convergence. The proposed scenario has potential to extend the current set of benchmarks in collective decision-making and our method has applications in adaptive risk-aware sampling and exploration in hazardous, dynamic environments. Collective decision-making under uncertainty is a fundamental challenge in multi-robot systems, including domains such as collective perception, environment classification, and spatial consensus [1]-[4]. Decentralized systems (e.g., robot swarms) operate under strict limitations on sensing, communication, and memory. Instead of sharing/storing complete observation histories, robots must maintain compact model representations of their knowledge. It is crucial to develop efficient strategies for collective decision-making, especially when observations are sparse, noisy [5], and gathered from stochastic processes [6]. This is typically characterized as a best-of-n problem [3], [7].


High-dimensional Bayesian filtering through deep density approximation

arXiv.org Machine Learning

In this work, we benchmark two recently developed deep density methods for nonlinear filtering. Starting from the Fokker--Planck equation with Bayes updates, we model the filtering density of a discretely observed SDE. The two filters: the deep splitting filter and the deep BSDE filter, are both based on Feynman--Kac formulas, Euler--Maruyama discretizations and neural networks. The two methods are extended to logarithmic formulations providing sound and robust implementations in increasing state dimension. Comparing to the classical particle filters and ensemble Kalman filters, we benchmark the methods on numerous examples. In the low-dimensional examples the particle filters work well, but when we scale up to a partially observed 100-dimensional Lorenz-96 model the particle-based methods fail and the logarithmic deep density method prevails. In terms of computational efficiency, the deep density methods reduce inference time by roughly two to five orders of magnitude relative to the particle-based filters.


Neural Jump ODEs as Generative Models

arXiv.org Machine Learning

In this work, we explore how Neural Jump ODEs (NJODEs) can be used as generative models for Itô processes. Given (discrete observations of) samples of a fixed underlying Itô process, the NJODE framework can be used to approximate the drift and diffusion coefficients of the process. Under standard regularity assumptions on the Itô processes, we prove that, in the limit, we recover the true parameters with our approximation. Hence, using these learned coefficients to sample from the corresponding Itô process generates, in the limit, samples with the same law as the true underlying process. Compared to other generative machine learning models, our approach has the advantage that it does not need adversarial training and can be trained solely as a predictive model on the observed samples without the need to generate any samples during training to empirically approximate the distribution. Moreover, the NJODE framework naturally deals with irregularly sampled data with missing values as well as with path-dependent dynamics, allowing to apply this approach in real-world settings. In particular, in the case of path-dependent coefficients of the Itô processes, the NJODE learns their optimal approximation given the past observations and therefore allows generating new paths conditionally on discrete, irregular, and incomplete past observations in an optimal way.